AITopics | debiased self-training

Debiased Self-Training for Semi-Supervised Learning

Neural Information Processing SystemsFeb-12-2026, 01:46:47 GMT

Despite its popularity, self-training is well-believed to be unreliable and often leads to training instability.

artificial intelligence, machine learning, pseudo label, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > California (0.04)
Asia > China > Guangxi Province > Nanning (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.69)

Add feedback

Debiased Self-Training for Semi-Supervised Learning

Neural Information Processing SystemsDec-25-2025, 08:51:22 GMT

Deep neural networks achieve remarkable performances on a wide range of tasks with the aid of large-scale labeled datasets. Yet these datasets are time-consuming and labor-exhaustive to obtain on realistic tasks. To mitigate the requirement for labeled data, self-training is widely used in semi-supervised learning by iteratively assigning pseudo labels to unlabeled samples. Despite its popularity, self-training is well-believed to be unreliable and often leads to training instability. Our experimental studies further reveal that the bias in semi-supervised learning arises from both the problem itself and the inappropriate training with potentially incorrect pseudo labels, which accumulates the error in the iterative self-training process.

debiased self-training, name change, pseudo label, (5 more...)

Neural Information Processing Systems

Genre: Research Report (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.59)

Add feedback

Supplemental Material: Debiased Self-Training for Semi-Supervised Learning

Neural Information Processing SystemsAug-19-2025, 03:44:48 GMT

We further explore the training stability of FixMatch when using pre-trained models on various tasks. Figure 2 illustrates several failure cases of FixMatch.

artificial intelligence, machine learning, pseudo label, (18 more...)

Neural Information Processing Systems

Country: Asia > China (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

Add feedback

Debiased Self-Training for Semi-Supervised Learning

Neural Information Processing SystemsAug-19-2025, 03:44:44 GMT

Despite its popularity, self-training is well-believed to be unreliable and often leads to training instability.

artificial intelligence, machine learning, pseudo label, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > California (0.04)
Asia > China > Guangxi Province > Nanning (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.69)

Add feedback

IceBerg: Debiased Self-Training for Class-Imbalanced Node Classification

Li, Zhixun, Chen, Dingshuo, Zhao, Tong, Wang, Daixin, Liu, Hongrui, Zhang, Zhiqiang, Zhou, Jun, Yu, Jeffrey Xu

arXiv.org Artificial IntelligenceFeb-10-2025

Graph Neural Networks (GNNs) have achieved great success in dealing with non-Euclidean graph-structured data and have been widely deployed in many real-world applications. However, their effectiveness is often jeopardized under class-imbalanced training sets. Most existing studies have analyzed class-imbalanced node classification from a supervised learning perspective, but they do not fully utilize the large number of unlabeled nodes in semi-supervised scenarios. We claim that the supervised signal is just the tip of the iceberg and a large number of unlabeled nodes have not yet been effectively utilized. In this work, we propose IceBerg, a debiased self-training framework to address the class-imbalanced and few-shot challenges for GNNs at the same time. Specifically, to figure out the Matthew effect and label distribution shift in self-training, we propose Double Balancing, which can largely improve the performance of existing baselines with just a few lines of code as a simple plug-and-play module. Secondly, to enhance the long-range propagation capability of GNNs, we disentangle the propagation and transformation operations of GNNs. Therefore, the weak supervision signals can propagate more effectively to address the few-shot issue. In summary, we find that leveraging unlabeled nodes can significantly enhance the performance of GNNs in class-imbalanced and few-shot scenarios, and even small, surgical modifications can lead to substantial performance improvements. Systematic experiments on benchmark datasets show that our method can deliver considerable performance gain over existing class-imbalanced node classification baselines. Additionally, due to IceBerg's outstanding ability to leverage unsupervised signals, it also achieves state-of-the-art results in few-shot node classification scenarios. The code of IceBerg is available at: https://github.com/ZhixunLEE/IceBerg.

artificial intelligence, machine learning, pseudo label, (16 more...)

arXiv.org Artificial Intelligence

2502.0628

Country:

Oceania > Australia > New South Wales > Sydney (0.05)
Asia > China > Hong Kong (0.04)
Asia > China > Beijing > Beijing (0.04)
(4 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Debiased Self-Training for Semi-Supervised Learning

Neural Information Processing SystemsJan-18-2025, 23:19:46 GMT

Deep neural networks achieve remarkable performances on a wide range of tasks with the aid of large-scale labeled datasets. Yet these datasets are time-consuming and labor-exhaustive to obtain on realistic tasks. To mitigate the requirement for labeled data, self-training is widely used in semi-supervised learning by iteratively assigning pseudo labels to unlabeled samples. Despite its popularity, self-training is well-believed to be unreliable and often leads to training instability. Our experimental studies further reveal that the bias in semi-supervised learning arises from both the problem itself and the inappropriate training with potentially incorrect pseudo labels, which accumulates the error in the iterative self-training process.

debiased self-training, pseudo label, semi-supervised learning, (4 more...)

Neural Information Processing Systems

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.60)

Add feedback

Debiased Self-Training for Semi-Supervised Learning

Chen, Baixu, Jiang, Junguang, Wang, Ximei, Wan, Pengfei, Wang, Jianmin, Long, Mingsheng

arXiv.org Artificial IntelligenceNov-9-2022

Deep neural networks achieve remarkable performances on a wide range of tasks with the aid of large-scale labeled datasets. Yet these datasets are time-consuming and labor-exhaustive to obtain on realistic tasks. To mitigate the requirement for labeled data, self-training is widely used in semi-supervised learning by iteratively assigning pseudo labels to unlabeled samples. Despite its popularity, self-training is well-believed to be unreliable and often leads to training instability. Our experimental studies further reveal that the bias in semi-supervised learning arises from both the problem itself and the inappropriate training with potentially incorrect pseudo labels, which accumulates the error in the iterative self-training process. To reduce the above bias, we propose Debiased Self-Training (DST). First, the generation and utilization of pseudo labels are decoupled by two parameter-independent classifier heads to avoid direct error accumulation. Second, we estimate the worst case of self-training bias, where the pseudo labeling function is accurate on labeled samples, yet makes as many mistakes as possible on unlabeled samples. We then adversarially optimize the representations to improve the quality of pseudo labels by avoiding the worst case. Extensive experiments justify that DST achieves an average improvement of 6.3% against state-of-the-art methods on standard semi-supervised learning benchmark datasets and 18.9%$ against FixMatch on 13 diverse tasks. Furthermore, DST can be seamlessly adapted to other self-training methods and help stabilize their training and balance performance across classes in both cases of training from scratch and finetuning from pre-trained models.

artificial intelligence, machine learning, pseudo label, (17 more...)

arXiv.org Artificial Intelligence

2202.07136

Country: